Term frequency-Inverse
document frequency (tf-idf) is a numerical statistic which reflects the
importance of words in a document collection. It is often used in information
retrieval system. The number of times a word appears in the document (word
frequency) is one of the major factors to acquire tf-idf.
You are asked to
write a program to find the most frequent word in a document.
Input. The first line contains an integer n (1 ≤ n
≤ 1000) which determines
the number of words. The following n lines include the list of words,
one word per line. A word contains only lower-case letters
and it can contain up to 20 characters.
Output. Print the
word that has the highest frequency and its frequency, separated by a single
space. If you get two or more results, choose one that comes later in the
lexicographical order.
Sample
input |
Sample
output |
10 mountain lake lake zebra tree lake zebra zebra animal lakes |
zebra 3 |
data structures - map
Algorithm analysis
Declare a data structure map<string, int> m, that will count how many times
each word occurs. Then find the word that occurs the most times. If there are
several such words, then print lexicographically the largest.
Algorithm realization
In the map m count how many times each word
appears.
map<string,int> m;
Read and count the words.
cin >> n;
for (i = 0; i < n; i++)
{
cin >> s;
m[s]++;
}
Find the word res that occurs the most times mx.
mx = 0;
for (auto x : m)
If several words occur the maximum number of times, then store in res the
lexicographically largest word.
if (x.second >= mx)
{
mx = x.second;
res = x.first;
}
Print the word
with the maximum frequency and the frequency itself.
cout << res << " " << mx << endl;
Java realization
import java.util.*;
public class Main
{
public static void main(String[] args)
{
Scanner con = new Scanner(System.in);
TreeMap<String, Integer> tree = new
TreeMap<String, Integer>();
int test = con.nextInt();
while(test-- > 0)
{
String s = con.next();
tree.put(s, tree.getOrDefault(s, 0) + 1);
}
int max = -1;
String res = null;
for(String s : tree.keySet())
{
int n = tree.get(s);
if (n >= max)
{
res = s;
max = n;
}
}
System.out.println(res + " " + max);
con.close();
}
}